Memorization in Deep Neural Networks: Does the Loss Function Matter?

Authors

Abstract

Deep Neural Networks, often owing to their overparameterization, are shown to be capable of exactly memorizing even randomly labelled data. Empirical studies have also shown that none of the standard regularization techniques mitigate such overfitting. We investigate whether the choice of loss function can affect this memorization. We empirically show, with the benchmark data sets MNIST and CIFAR-10, that a symmetric loss function, as opposed to either cross entropy or squared error, results in a significant improvement in the ability of the network to resist such memorization. We then provide a formal definition of robustness to memorization and a theoretical explanation for why symmetric losses provide this robustness. Our results clearly bring out the role that loss functions alone can play in this phenomenon.
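As background (not taken from the paper's own code): one standard example of a symmetric loss from the noise-robust-loss literature is the mean absolute error (MAE) between the predicted distribution and the one-hot label, for which the sum of the loss over all possible labels is a constant that does not depend on the prediction. The sketch below checks that property numerically and contrasts it with cross entropy; the choice of MAE and all names are illustrative assumptions.

```python
# Illustrative sketch (not the paper's code): compare cross entropy with MAE,
# a standard example of a symmetric loss. "Symmetric" means the loss summed
# over all K possible labels is a constant, independent of the prediction.
import numpy as np

def cross_entropy(p, y):
    """Cross-entropy loss for predicted distribution p and true class y."""
    return -np.log(p[y])

def mae(p, y):
    """MAE between p and the one-hot vector for class y; equals 2 * (1 - p[y])."""
    one_hot = np.zeros_like(p)
    one_hot[y] = 1.0
    return np.abs(one_hot - p).sum()

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=10)
    p = np.exp(logits) / np.exp(logits).sum()      # softmax over K = 10 classes

    # Sum each loss over every possible label to test the symmetry condition.
    ce_sum = sum(cross_entropy(p, k) for k in range(10))
    mae_sum = sum(mae(p, k) for k in range(10))
    print(f"sum_k CE(p, k)  = {ce_sum:.4f}  (depends on p)")
    print(f"sum_k MAE(p, k) = {mae_sum:.4f}  (always 2*(K-1) = 18)")
```

This constant-sum property is what "symmetric" refers to in the abstract; cross entropy and squared error do not satisfy it.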


Similar articles

Detecting Learning vs Memorization in Deep Neural Networks using Shared Structure Validation Sets

Abstract: The roles played by learning and memorization represent an important topic in deep learning research. Recent work on this subject has shown that the optimization behavior of DNNs trained on shuffled labels is qualitatively different from DNNs trained with real labels. Here, we propose a novel permutation approach that can differentiate memorization from learning in deep neural network...


A Closer Look at Memorization in Deep Networks

We examine the role of memorization in deep learning, drawing connections to capacity, generalization, and adversarial robustness. While deep networks are capable of memorizing noise data, our results suggest that they tend to prioritize learning simple patterns first. In our experiments, we expose qualitative differences in gradient-based optimization of deep neural networks (DNNs) on noise vs...
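For concreteness, the kind of random-label experiment this line of work builds on can be sketched as follows: train the same small network once on the true labels and once on shuffled labels, then compare how well each can be fit. This is a generic illustration using scikit-learn, not the authors' setup; the dataset and hyperparameters are placeholders.

```python
# Generic random-label experiment (illustrative only, not the authors' code):
# fit the same small MLP on real labels and on shuffled labels and compare
# the training accuracy each run reaches.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
rng = np.random.default_rng(0)
y_shuffled = rng.permutation(y)              # labels carry no information about X

for name, labels in [("real labels", y), ("shuffled labels", y_shuffled)]:
    clf = MLPClassifier(hidden_layer_sizes=(256,), max_iter=1000, random_state=0)
    clf.fit(X, labels)                       # may warn if it hits the iteration cap
    print(f"{name:16s} training accuracy: {clf.score(X, labels):.3f}")
```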


Why Deep Neural Networks for Function Approximation?

Recently there has been much interest in understanding why deep neural networks are preferred to shallow networks. We show that, for a large class of piecewise smooth functions, the number of neurons needed by a shallow network to approximate a function is exponentially larger than the corresponding number of neurons needed by a deep network for a given degree of function approximation. First, ...


The loss surface and expressivity of deep convolutional neural networks

We analyze the expressiveness and loss surface of practical deep convolutional neural networks (CNNs) with shared weights. We show that such CNNs produce linearly independent (and thus linearly separable) features at every “wide” layer which has more neurons than the number of training samples. This condition holds e.g. for the VGG network. Furthermore, we provide for such wide CNNs necessary a...
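In symbols (the notation Φ_l, N, n_l is introduced here for illustration, not taken from the paper): if the N training inputs have features φ_l(x_i) in R^{n_l} at a wide layer l with n_l ≥ N, the claim is that the feature matrix has full row rank, which makes any labelling of the training set realizable by a single linear readout from that layer onward.

```latex
\Phi_l =
\begin{bmatrix}
  \phi_l(x_1)^\top \\ \vdots \\ \phi_l(x_N)^\top
\end{bmatrix}
\in \mathbb{R}^{N \times n_l},
\qquad
\operatorname{rank}(\Phi_l) = N
\;\Longrightarrow\;
\forall\, Y \in \mathbb{R}^{N \times K}\ \ \exists\, W \in \mathbb{R}^{n_l \times K}:\ \Phi_l W = Y .
```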


The Loss Surface of Deep and Wide Neural Networks

While the optimization problem behind deep neural networks is highly non-convex, it is frequently observed in practice that training deep networks seems possible without getting stuck in suboptimal points. It has been argued that this is the case as all local minima are close to being globally optimal. We show that this is (almost) true; in fact, almost all local minima are globally optimal for...



Journal

Journal title: Lecture Notes in Computer Science

Year: 2021

ISSN: 1611-3349, 0302-9743

DOI: https://doi.org/10.1007/978-3-030-75765-6_11